Raising the Interlingual Ceiling with Multilingual Text Generation
Abstract
In a typical interlingual machine translation (MT) system, the tasks of text planning and content selection are not explicitly performed. Rather, they are assumed to be implicit in the interlingual representation derived in the source language analysis phase. This simplifies the task of target language generation, but greatly restricts the flexibility of the resulting text. Recent MT systems have pushed the level of the interlingua to higher, more semantic/pragmatic levels of representation, and have correspondingly included some text planning capabilities in their generation phases, but the process remains closely bound by the structure and content of the source text. This places a “ceiling” on the extent of the capability of interlingual systems. In this paper we argue that in the general case, the structure and content of “parallel” texts need not be the same. We then propose that multilingual text generation (MLG) has the potential to address this phenomenon by allowing a more flexible variation of text structure and of content selection in a manner unrestricted by a “source” text.

This work is partially supported by the Engineering and Physical Sciences Research Council (EPSRC) Grant J19221, the Commission of the European Union Grant LRE-62009, and by BC/DAAD ARC Project 293.
Similar resources
ULiS: An Expert System on Linguistics to Support Multilingual Management of Interlingual Knowledge bases
We are interested in bridging the world of natural language and the world of the semantic web in particular to support multilingual access to the web of data, and multilingual management of interlingual knowledge bases. In this paper we introduce the ULiS project, that aims at designing a pivot-based NLP technique called Universal Linguistic System, 100% using the semantic web formalisms, and b...
Semantic Annotation for Interlingual Representation of Multilingual Texts
This paper describes the annotation process being used in a multi-site project to create six sizable bilingual parallel corpora annotated with a consistent interlingua representation. After presenting the background and objectives of the effort, we describe the multilingual corpora and the three stages of interlingual representation being developed. We then focus on the annotation process itsel...
Interlingual Annotation of Parallel Text Corpora: A New Framework for Annotation and Evaluation
This paper focuses on the next step in the creation of a system of meaning representation and the development of semantically-annotated parallel corpora, for use in applications such as machine translation, question answering, text summarization, and information retrieval. The work described below constitutes the first effort of any kind to provide parallel corpora annotated with detailed deep ...
An Expert System on Linguistics to Support Natural Multilingual Collaborative Management of Interlingual Semantic Web Knowledge bases
We are interested in bridging the world of natural language and the world of the semantic web in particular to support multilingual access to the web of data. In this paper we introduce the ULiS project, that aims at designing a pivot-based NLP technique called Universal Linguistic System, 100% using the semantic web formalisms, and being compliant with the Meaning-Text theory. Through the ULiS...
Interlingual Annotation Of Multilingual Text Corpora
This paper describes a multi-site project to annotate six sizable bilingual parallel corpora for interlingual content. After presenting the background and objectives of the effort, we will go on to describe the data set that is being annotated, the interlingua representation language used, an interface environment that supports the annotation task and the annotation process itself. We will then...